The "Corpus of Interactional Data" (CID) - Multimodal annotation of conversational speech

Authors

  • Roxane Bertrand
  • Philippe Blache
  • Robert Espesser
  • Gaëlle Ferré
  • Christine Meunier
  • Béatrice Priego-Valverde
  • Stéphane Rauzy
Abstract

Understanding language mechanisms requires taking precise account of the interaction between all the different domains or modalities, which in turn requires building and developing resources. We describe here the CID (Corpus of Interactional Data), an audio-video corpus of French recorded and processed at the Laboratoire Parole et Langage (LPL). The corpus has been annotated from a multimodal perspective covering phonetics, prosody, morphology, syntax, discourse and gesture studies. The first results of our studies on the CID confirm the relevance of an analysis that takes into account as many linguistic fields as possible in order to build a more precise understanding of discourse phenomena.

KEYWORDS: multimodal encoding scheme, annotation tools and platform, phonetics, prosody, morphology, syntax, discourse, gesture.


Similar resources

The OTIM Formal Annotation Model: A Preliminary Step before Annotation Scheme

Large annotation projects, typically those addressing multimodal annotation, in which many different kinds of information have to be encoded, must develop precise, high-level annotation schemes. Doing so first requires defining the structure of the information: the different objects and their organization. This stage has to be as independent as possible ...


Naïve listeners’ perception of prominence and boundary in French spontaneous speech

Our main goal here is to explore the link between naïve listeners’ perception of prominences and boundaries in spontaneous speech and experts’ annotation of prosodic hierarchy and accentuation in French. We first present the design of our corpus, which consists of 133 utterances extracted from the Corpus of Interactional Data (CID). 73 naïve listeners judged prominences and boundaries using thr...


A quantitative view of feedback lexical markers in conversational French

This paper presents a quantitative description of the lexical items used for linguistic feedback in the Corpus of Interactional Data (CID). The paper includes the raw figures for feedback lexical items as well as more detailed figures concerning interindividual variability. This effort is a first step before a broader analysis including more discourse situations and featuring communicative funct...


Automatic analysis of multiparty meetings

This paper is about the recognition and interpretation of multiparty meetings captured as audio, video and other signals. This is a challenging task since the meetings consist of spontaneous and conversational interactions between a number of participants: it is a multimodal, multiparty, multistream problem. We discuss the capture and annotation of the AMI meeting corpus, the development of a m...


Automatic detection of other-repetition occurrences: application to French conversational Speech

This paper investigates the discursive phenomenon called other-repetitions (OR), particularly in the context of spontaneous French dialogues. It focuses on their automatic detection and characterization. A method is proposed to retrieve ORs automatically: this detection is based on rules that are applied to the lexical material only. This automatic detection process has been used to label other-...



Journal:
  • TAL

Volume 49  Issue 

Pages  -

Publication date 2008